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Brian Fox 
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5 



BACKGROUND OF THE INVENTION 



6 



Field of the Invention 



7 



The present invention relates generally to a 



8 configurable system on a chip (CSoC) , and specifically to 

9 structures and methods regarding the configuration of the 
10 CSoC. 

11 

12 Description of the Related Art 

13 Programmable logic devices, such as field programmable 

14 logic devices (FPLDs) , are programmed to perform user- 

15 specified logic functions by loading configuration data 

16 into the FPLD. This configuration data is typically loaded 

17 into the FPLD as a bitstream, i.e. a string of binary bits. 

18 Each bit programs a specific programmable resource on the 

19 device. Thus, some bits configure the logic blocks which 
2 0 perform the user-defined logic functions, other bits 

21 configure the input /output blocks which interface to 

22 devices external to the FPGA, and yet other bits configure 

23 the programmable interconnect that connects the logic 

24 blocks and the input/output blocks. 

25 Typically, the configuration data is stored in a 

26 nonvolatile memory device and loaded into the FPLD upon 

27 device power-up. The configuration information is loaded 
2 8 into the FPLD in data frames using a shift register. This 

29 shift register has N serially coupled one-bit shift 

30 registers, each register clocked by the same clock signal. 

31 As the bitstream is serially shifted into the shift 
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1 register, the bits of the shift register are bit shifted 

2 downstream in a synchronized manner. 

3 Once the shift register is fully loaded, thereby 

4 indicating a frame of data is complete, the stored N bits 

5 are transferred simultaneously via dedicated lines to some 

6 of the configuration memory cells. Typically, the 

7 configuration memory cells are in an array, wherein a frame 

8 of data corresponds to one column of configuration memory 

9 cells in that array. After the write cycle is complete, 

10 the loading of another frame of configuration data begins. 

11 These store and write cycles continue until all of the bits 

12 of the configuration bitstream are written into the FPLD. 
O 13 Certain programming limitations are inherent in such 
J 14 FPLDs. For example, because each bit corresponds to a 

15 specific programmable resource and each frame of 

ijj 16 configuration data is loaded in a fixed sequence, 

S 17 reconfiguration of many FPLDs is an "all or nothing" 

$ 18 process. In other words, because there is no way to 

n 19 restrict reconfiguration to a part of the device, the 

20 entire bitstream must be reloaded into the FPLD. Moreover, 
03 21 the shift register and dedicated lines take up valuable 

22 silicon real estate. 

23 Therefore, a need arises for a structure and method of 

24 providing programmable logic solutions, while at the same 

25 time ensuring more efficient use of silicon and system 
2 6 resources. 

27 

28 SUMMARY OF THE INVENTION 

2 9 The present invention allows a user to customize the 

3 0 configuration sequence of a configurable system on a chip 

31 (CSoC) , thereby adding considerable flexibility to the 

32 configuration process. The present invention also provides 
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1 certain features, transparent to the user, which optimize 

2 system resources and ensure the correct initialization of 

3 the CSoC. 

4 In accordance with one aspect of the present 

5 invention, the CSoC leverages an on-chip central processing 

6 unit (CPU) to control the configuration process of the 

7 configurable system logic (CSL) . Advantageously, the CSL 

8 configuration memory cells as well as other programmable 

9 locations in the CSoC are addressable as part of a system 

10 bus address space, thereby eliminating the dedicated 

11 configuration shift register of prior art FPLDs. 

12 Moreover, the present invention provides that this 

13 system bus is a multi-use structure. Thus, the system bus 

14 of the CSoC may be used for both configuring and reading of 

15 memory cells. In this manner, the present invention 

16 substantially eliminates the configuration interconnect 

17 used in prior art FPLDs, thereby optimizing system 

18 resources . 

19 in one embodiment, an on-chip direct memory access 
2 0 (DMA) controller controls the bus for configuring of the 

21 memory cells. Using the DMA controller as a bus master 

22 rather than the CPU reduces the number of clock cycles 

23 needed for configuration, thereby reducing power 

24 consumption. In another embodiment, the CPU controls the 

25 bus during both configuration and read modes, thereby 

26 reducing the number of on-chip components. 

27 The present invention provides significant latitude to 

28 the user to customize the configuration process. For 

29 example, the user can select one clock during configuration 

30 of the CSL and another clock after configuration of the 

31 CSL. In one embodiment, the CPU initially sets key timing 

32 parameters to facilitate accessing the slowest available 
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1 parallel memory device on the market and then accelerates 

2 the access as appropriate for the specific memory device or 

3 as selected by the user. Thus, the present invention 

4 advantageously provides the user maximum flexibility in the 

5 selection of the clock source, both in the configuration 

6 mode as well as in the user mode. 

7 In accordance with the present invention, the user can 

8 also determine the initialization sequence of the CSoC. 

9 For example, global buffers used for clock, high fan-out, 

10 or reset signals can be selectively enabled in the sequence 

11 designated by the user. Other elements of CSoC, including 

12 but not limited to, registers, programmable inputs /out puts, 

13 and RAM can be selectively enabled in the same manner. 

14 The present invention provides certain transparent 

15 features that provide optimal flexibility to the user. For 

16 example, an initialization sequence first checks for either 

17 a serial or a parallel external memory device coupled to a 

18 memory interface unit (M1U) in the CSoC. The CPU sets the 

19 MIU to an appropriate mode based on the type of external 

20 memory device. Then, the CPU searches for a header in the 

21 memory device. If the header is found, then the CSL is 

22 configured. If the header is not found, the MIU is 

23 switched to another mode. In one embodiment, if the header 

24 is not found in the new mode, then the present invention 

2 5 checks whether the memory device is accessible, powers down 

26 the CSoC if the memory device is accessible, and repeats 

27 the header search if the external memory device is not 

28 accessible. 

2 9 Another aspect of the present invention is providing 

30 predetermined signals in the CSL until the completion of 

31 configuration. Special buffers in the general interconnect 

32 of the CSL, called zip buffers, provide this function. 
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1 Zip buffers include first-tier and second tier 

2 multiplexers. Each first-tier multiplexer has input 

3 terminals coupled to a set of general interconnect lines. 

4 Each of the second-tier multiplexers has a first input 

5 terminal coupled to one output terminal of the first -tier 

6 multiplexers and a second input terminal coupled to a 

7 constant voltage source. The second-tier multiplexers 

8 selectively provide output signals from the first terminals 

9 or the second terminals to the logic block. Specifically, 

10 during the configuration mode, the second- tier multiplexers 

11 provide only constant signals to the logic block, thereby 

12 preventing contention within the logic block. During the 

13 user mode, the second- tier multiplexers provide signals 

14 from the general interconnect to the logic block. 

15 The present invention also provides a unique input 

16 multiplexer to receive the signals provided by the zip 

17 buffers. The input multiplexer includes a first 

18 multiplexer that receives a first plurality of input 

19 signals and a constant signal, a second multiplexer that 

20 receives a second plurality of input signals and a mode 

21 signal, and a third multiplexer that receives output 

22 signals from the first and second multiplexers and provides 

23 an output signal on a logic block input line. The input 

24 multiplexer further includes a transistor coupled to the 

25 internal logic block line, wherein the first transistor is 

26 controlled by the mode signal. In one embodiment, the mode 

27 signal, which determines whether the logic block is in a 

2 8 configuration mode or a user mode, is generated by control 
2 9 logic in the CSL. 

30 During configuration, the mode signal, the constant 

31 signal, and all input signals are identical. To provide 

32 further protection to the internal circuits of the CSL, the 
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1 mode signal also turns on the transistor coupled to the 

2 internal logic block line. The transistor, which is 

3 further coupled to the constant signal, thereby provides 

4 this constant signal to the internal logic block line. In 

5 this manner, the present invention ensures that the 

6 internal logic of the CSL remains stable until 

7 configuration is complete. 

8 Because the configuration memory is randomly 

9 accessible, just as any other memory on the CSoC, the 

10 configuration memory can be addressed in any order, thereby 

11 providing enhanced security and debug capabilities to the 

12 user. Additionally, the configuration memory can be easily 

13 addressed as subsets, thereby facilitating rapid 

14 reconfiguration. In one embodiment, a user may choose 

15 between various levels of security on the CSoC. 

16 The present invention will be more fully understood in 

17 light of the following description and drawings. 
18 

19 BRIEF DESCRIPTION OF THE DRAWINGS 

2 0 Figure 1 illustrates a configurable system on a chip 

21 (CSoC) in a typical system environment. 

22 Figure 2 shows a block diagram of the basic structures 

23 in a CSoC. 

24 Figure 3, comprising Figures 3A, 3B, and 3C, 

25 illustrates a flow chart of the configuration of a CSoC in 

26 both a parallel and series mode. 

27 Figure 4 shows a block diagram of the configurable 

28 system logic including logic banks, horizontal breakers, 

2 9 vertical breakers, corner breakers, and interface logic. 

3 0 Figure 5 illustrates further details of a logic bank, 
31 a horizontal breaker, and a vertical breaker. 
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1 Figure 6 shows a vertical breaker tile including 

2 multi-purpose buses. 

3 Figure 7A illustrates one embodiment of zip buffers 

4 controlled by configuration memory cells. 

5 Figure 7B shows one embodiment of an input multiplexer 

6 controlled by configuration memory cells. 
7 

8 DETAILED DESCRIPTION OF THE DRAWINGS 

9 A configurable system on a chip (CSoC) is a 

10 monolithic, integrated circuit device that performs a 

11 variety of microcontroller functions. Figure 1 illustrates 

12 a CSoC 101 in a typical system environment 100. In this 
Q 13 environment, a memory device 103, after receiving the 

'% 14 appropriate chip enable (CE) , write enable (WE) , and output 

^ 15 enable (OE) signals from CSoC 101, provides CSoC 101 with 

!jj 16 the configuration data for its programmable resources. In 

;J 17 one embodiment, memory device 103 is a flash read only 

18 memory (ROM) which provides its output data in parallel. 
m 19 Note that other CSoC system embodiments may use other 

\t 20 memory devices for configuration. These memory devices may 

J 21 be implemented in other technologies, such as erasable 

" W 22 programmable read only memory (EPROM) or electrically 

23 erasable read only memory (EEPROM) technologies. Moreover, 

24 as described in more detail in reference to Figures 3A-3C, 

2 5 the memory device may be a serial device instead of a 

26 parallel device. 

27 CSoC 101 is also connected to a JTAG connector 102. 

28 JTAG, an acronym for the Joint Test Action Group, refers to 

29 the IEEE Std 1149.1-1990 which defines a test access port 

3 0 and boundary scan architecture for digital integrated 

31 circuits. This standard is well known to those skilled in 

32 the art and therefore is not described in detail herein. 
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1 Figure 2 illustrates the major structures of CSoC 101. 

2 CSoC 101 includes configurable system logic (CSL) 201, a 

3 programmable logic section to implement user-defined logic. 

4 CSL 201 is programmed by loading configuration memory cells 

5 (not shown) . These configuration memory cells control the 

6 logic and general interconnect of CSL 201. A memory 

7 interface unit (MIU) 204 facilitates the transfer of the 

8 logic values for the configuration memory cells from an 

9 external memory (for example, memory device 103 (Figure 1)) 

10 to CSL 201 via a system bus 205. A random access memory 

11 (RAM) 202 provides user memory in addition to that provided 

12 by CSL 201. Programmable inputs/outputs (PIOs) 201A 

□ 13 provide connection between CSL 201 and other resources (not 

; | 14 shown) external to CSoC 101. 

h*' 15 in accordance with the present invention, CSoC 101 

hj 16 advantageously uses an on-chip central processing unit 

2 17 (CPU) 206 during the configuration process. Specifically, 

* 18 CPU 206 executes a software program to write configuration 

S 19 data to the configuration memory cells of CSoC 101. Note 

^ 2 0 that the configuration memory cells are found principally, 

J2 21 but not exclusively, in CSL 201. Specifically, 

:4i 22 configuration memory cells are also included in PIOs 201A. 

23 Figure 3, which comprises Figures 3A-3C, is a flow 

24 chart of the configuration process in CSoC 101. After 

25 power-on in step 301, MIU 204 (Figure 2) initializes as a 

26 multi-chip slave (step 302) that can be accessed by 

27 multiple external memory devices. If CSoC 101 is a slave 
2 8 chip, as indicated by a logic one signal on the SLAVE pin 

29 (not shown in Figure 1), CPU 206 is held in reset (step 

30 303) . If the signal on the SLAVE pin is a logic zero, 

31 thereby indicating that CSoC 101 is the master chip, then 

32 CPU 206 takes control and begins to execute the 
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1 initialization code stored in non-volatile memory 203 (step 

2 304) . In one embodiment, the initialization code is 

3 divided between non-volatile memory 203 (internal) and 

4 memory device 103 (external) , wherein non-volatile memory 

5 203 stores only a standard start-up sequence and memory 

6 device 103 stores the user-specific code (described in 

7 detail in reference to steps 318, 319 and 325) . 

8 In step 305, CPU 206 determines if CSoC 101 should 

9 stay in a secure mode. In Figure 2, there are two points 

10 of access to CSoC 101: JTAG connector 102 and memory device 

11 103 . Other environments may provide fewer or more points 

12 of access. In accordance with the present invention, a 

13 user may choose between various levels of security. 

14 Specifically, in Figure 2, JTAG connector 102 or memory 

15 device 103 may be selectively disconnected (individually or 

16 in combination) from or remain connected to CSoC 101. This 

17 security designation is translated into executable code and 

18 stored in memory device 103. As indicated previously, CPU 

19 206 may read a portion of memory device 103 in step 304. 

20 In one embodiment, the user directs CPU 206 to deny access 

21 to JTAG connector 102 as well as to external memory 103, 

22 thereby preventing any readback of the user's design. If 

23 memory device 103 is disconnected, then CSoC 101 must have 

24 been previously programmed, i.e. all values already loaded 

25 into the configuration memory cells, and a reset function 

26 triggered. If CSoC 101 should be in a secure mode, CPU 206 

27 executes the code stored in RAM 202 in step 305A. 

28 Otherwise, CSoC 101 enters a non-secure mode 306. 

29 A non-secure mode indicates that CSoC 101 will operate 

30 in conjunction with external resources, such as memory 

31 device 103. Therefore, in step 307, CPU 206 checks whether 

32 memory device 103 is a parallel or serial device. As shown 
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1 in Figure 1, memory device 103 is a parallel device. Thus, 

2 at this point, CSoC 101 would enter a parallel mode 308. 

3 In step 309, MIU 204 is set in parallel mode. 

4 Specifically, in one embodiment, CPU 2 06 initially sets key 

5 timing parameters to facilitate accessing the slowest 

6 available memory device on the market and then accelerates 

7 the access as appropriate for the specific memory device. 

8 In step 310, CPU 206 searches in memory device 103 for 

9 a valid identification (ID) header. In accordance with the 

10 present invention, multiple configuration files may be 

11 stored in memory device 103. These files may be associated 

12 with different devices or different functions performed by 

13 CSoC 101. The ID header identifies a valid configuration 

14 file in memory device 103. 

15 if a header is not found in step 311, then CPU 206 

16 determines whether a signal on the VSYS pin of CSoC 101 is 

17 high in step 312. The VSYS pin is connected to the power 

18 supply of devices external to CSoC 101, such as memory 

19 device 103. If the power supply to CSoC 101 (pin VCC) is, 

20 for example, 3.3 volts and the power supply to memory 

21 device 103, is 5 volts, then memory device 103 might not be 

22 fully powered up and thus not accessible at the time CPU 

23 2 06 searches in memory device 103 for a valid ID header. 

24 The present invention ensures that CPU 206 is given an 

25 opportunity to search for the header when memory device 103 
2 6 is accessible. Therefore, if the VSYS signal is low in 

27 step 312, CPU 206 returns to non-secure mode 306 to begin 

28 another read cycle. 

2 9 On the other hand, if the VSYS signal is high, then 

30 memory device 103 is accessible and CPU 206 should have 

31 been able to find a valid ID header. Therefore, either a 

32 valid ID header is not present or the data stored in memory 
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1 device 103 is corrupted. Either case requires user 

2 intervention or correction before the configuration can 

3 continue. Therefore, CSoC 101 is powered down in step 313. 

4 Once powered down, CPU 2 06 waits until it receives a reset 

5 (RST) signal from the user in step 314. At this point, CPU 

6 206 returns to non-secure mode 306. 

7 If a header is found in step 311, then CPU 206 begins 

8 code execution (step 317) from memory device 103 at the 

9 address indicated by the header. In one embodiment, the 

10 header includes executable code to trigger this CPU 

11 process. 

12 After code execution begins, CPU 206 selects the user- 

13 designated clock source in step 318. CSoC 101 may include 

14 an internal ring oscillator (not shown) , or may be coupled 

15 to multiple external clock sources (such as a crystal 

16 oscillator (Figure 1) via terminals XTAL1 and XTAL2) . 

17 Selection between multiple clock sources is generally 

18 implemented with a multiplexer as shown by clock control 

19 208 in Figure 2. This choice of clock sources provides 
^ 20 optimal flexibility in configuration speed to the user. 

J3 21 In step 319, CPU 206 executes a routine, if necessary, 

M 22 which synchronizes external system resources (if present) 

23 connected to system bus 205. Generally, the configuration 

24 process requires significant data transfers. The present 

25 invention advantageously provides a selectable interrupt or 

26 wait command in step 319 before this data transfer occurs. 

27 In one embodiment, CPU 206 waits for authorization from 
2 8 these external system resources before continuing the 

29 configuration process. For example, if external memory 103 

30 is a dual-ported memory, then in step 319 CPU 206 may 

31 receive an instruction to wait until that dual -ported 

32 memory is fully loaded. 
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1 In step 32 0, CPU 206 sets up the parameters for a DMA 

2 transfer, then DMA controller 207 takes over as the master 

3 of system bus 205. Using DMA controller 207 as a bus 

4 master rather than CPU 2 06 reduces the number of clock 

5 cycles needed for configuration, thereby reducing power 

6 consumption. At this point, DMA controller 207 uses system 

7 bus 2 05 and MIU 2 04 to configure the internal programmable 

8 resources of CSL 210 and PIOs 201A. Additional internal 

9 programmable resources of CSoC 101, such as the control 

10 register unit (CRU) which is distributed throughout CSoC 

11 101, may also be programmed at this time. The CRU is the 

12 repository of all control bits for semi-custom gates. 

■;2 13 Thus, the sequencing that will be discussed in reference to 

p 14 step 325, for example, is controlled by the CRU. Either 

K 15 CPU 206 or DMA controller 207 programs the CRU. 
\7i 16 Note that the present invention allows other bus 

;2 17 masters to configure CSoC 101. As noted previously, system 

18 bus 205 is a multi-master bus. Therefore, in another 

S 19 embodiment of the present invention, CPU 206 performs the 

M; 20 configuration of CSL 210 and PIOs 201A. In yet another 

;J3 21 embodiments, an external device to CSoC 101, such as 

^ 22 another CSoC or a memory device containing executable code, 

23 communicates with CPU 2 06 via MIU 2 04 and system bus 2 05, 

24 becomes master of system bus 205, and continues the 

25 configuration of CSoC 101. In yet a further embodiment, an 

26 external device communicates with CPU 2 06 via JTAG 

27 interface 102 and system bus 205, becomes master of system 

28 bus 205, and continues the configuration of CSoC 101. For 
2 9 purposes of illustration only, the following description 

30 assumes that DMA controller 207 is bus master starting in 

31 configuration step 320. 
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During the transfer of data from memory device 103 to 
the above-mentioned internal resources, DMA controller 207 
runs a verify CRC checksum algorithm in step 321 to improve 
the integrity of those data signals. This CRC checksum 
algorithm and the use of DMA controller 207 to run this 
algorithm are well known in the art and therefore are not 
explained in detail herein. 

Assuming all data is transferred successfully, as 
verified by DMA controller 2 07 in step 321, CPU 2 06 enters 
the post CSL configuration setup stage 322. In this stage, 
other user-specified data may be loaded into RAM 2 02 (step 
323) . This user-specified data may include values for 
lookup tables (LUTs) and registers in CSL 201, specific 
configuration memory locations in CSL 201, the CRU, or 
programs to be run by CPU 2 01 in the user mode. 

In one embodiment, the values written to RAM 2 02 
include information for partial reconfiguration of the 
programmable resources in CSL 201 during the user mode. 
Because many programmable logic designs include repetitive 
sections, this aspect of the present invention saves 
considerable silicon real estate and time. 

At this point, CSoC 101 is essentially programmed. 
However, before entering the user mode, the clock source of 
CSoC 101 may again be chosen as indicated in step 324. 
This step, performed by CPU 206, allows the user to select 
different clock sources for the configuration mode and the 
user mode. For example, some users prefer a fast 
configuration mode, but need a low power (i.e. slower) user 
mode. Other users have no timing requirements during 
configuration, but want to maximize system performance 
during the user mode, for example by using high speed clock 
generation circuitry including a phase locked loop. Thus, 
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1 the present invention advantageously provides the user 

2 maximum flexibility in the selection of the clock source, 

3 both in the configuration mode as well as in the user mode. 

4 In accordance with the present invention, step 325 

5 ensures proper system synchronization by allowing the user 

6 to determine the enablement sequence of various elements of 

7 CSoC 101. In one embodiment, the enablement sequence is 

8 provided by the user during the capture of the user's logic 

9 design and the appropriate instructions are generated by 

10 the programming software and loaded into memory device 103. 

11 Alternatively, the user may provide proprietary executable 

12 code including the instructions for the enablement 

13 sequence. In either case, these instructions are read by 

14 CPU 2 06, which controls the synchronization process. The 

15 following areas of selective enablement may be included in 

16 step 325: CSL zip buffers and input multiplexers 

17 (explained in detail in reference to Figures 7A and 7B) , 

18 global buffers (used for clock, high fan-out, or reset 

19 signals), registers (i.e. flip-flops), and lookup table 

20 (LUT) RAM in CSL 201. 

21 CPU 206 also controls the selective enabling of PIOs 

22 2 01A in step 325, thereby providing certain output signals 

23 before other output signals from CSoC 101. A more detailed 

24 description of PIOs 201A is provided in U.S. Serial No. 

25 (TRI-004), filed on October 15, 1999 by Triscend 

2 6 Corporation, and entitled "An Input /Output Circuit With 

27 User Programmable Functions", which is incorporated by 

28 reference herein. 

2 9 In one embodiment, the zip buffers are enabled first, 

3 0 followed by the global buffers, the registers, and the PIOs 

31 (a typical programmable logic device sequence) to ensure 

32 that CSoC 101 drives the proper output signals to the 
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1 external system. In another embodiment, the registers and 

2 PIOs of CSoC 101 are enabled and the system external to 

3 CSoC 101 is allowed to stabilize before the zip and global 

4 buffers are enabled. 

5 Finally, also during step 325, CPU 206 allows for the 

6 inclusion of customer, i.e. executable, code into the 

7 sequence, thereby providing optimal flexibility to the 

8 user. 

9 In step 326, the user may selectively put CSoC 101 

10 into a secure mode. In this manner, CSoC 101 will not read 

11 from any external devices unless a power-down cycle is 

12 triggered. In one embodiment, the secure mode allows 

13 reading from only certain devices, such as memory device 

14 103 or JTAG connector 102, or neither. In the secure mode, 

15 if a reset signal is received, CPU 206 returns to step 305A 

16 and executes the user code from RAM 202. 

17 If a non-secure mode is chosen, then in step 327, 

18 program control is transferred to the user code residing in 

19 memory device 103 (note that even in a non- secure mode, CPU 

20 206 can still access non-volatile memory 203) . Accessing 
fj 21 memory device 103, although somewhat slower and higher in 

22 power consumption because of toggling input /output 

23 circuits, allows access to significantly larger memory 

24 resources. The configuration mode ends in step 32 8, 

25 thereby triggering the user mode. 

26 As described in further detail below, the present 

27 invention advantageously not only allows multiple memory 

28 devices to be connected to CSoC 101, but also allows 

2 9 various types of memory devices to be accessed, thereby 

3 0 significantly increasing user flexibility. For example, 

31 returning to step 307, if the external memory has a serial 

32 interface, then a serial mode begins in step 330. In this 
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1 mode, CPU 206 sets MIU 204 in the serial mode (step 331) , 

2 resets the address pointer in the external memory device 

3 (step 332) , and then searches for a valid ID header in that 

4 memory device (step 333) . If a valid ID header is not 

5 found, CPU 2 06 will assume another type of memory device is 

6 connected to CSoC 101, and will automatically switch over 

7 to the parallel mode 308. However, if the valid ID header 

8 is found in the external memory device, then the serial 

9 mode is continued in step 334. Specifically, the program 

10 portion of the code is transferred in step 335 from the 

11 external memory device to RAM 202 and executed by CPU 206 

12 in step 336. At this point, CPU 206 continues the 

13 configuration process as indicated in steps 318-328. 
14 

15 System Bus Resources 

16 In accordance with the present invention, system bus 

17 2 05 may be used for both configuration data transfer and 

18 general interconnect. This dual use provides significant 

19 silicon savings compared to conventional field programmable 
2 0 logic devices that require dedicated wires on chip to 

21 connect the external memory device to the configuration 

22 memory cells. 

23 Figure 4 illustrates one embodiment of CSL 201 in 

24 which a plurality of logic banks 401 are arranged in rows 

25 and columns. Relative to every logic bank 401, a vertical 

26 breaker 402 is positioned on the top edge, a horizontal 

27 breaker 403 is positioned on the left edge, and a corner 

2 8 breaker 4 04 is positioned in the left corner. In this 

29 embodiment, interface logic 405 distributes address signals 

3 0 down every column defined by corner breakers 4 04 and 
31 horizontal breakers 403. 
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1 Specifically referring to Figure 5, from each corner 

2 breaker 404, the address signals sw_adr [ 3 1 : 0 ] are 

3 distributed to the horizontal breaker 4 03 immediately below 

4 the corner breaker 404. These address signals are then 

5 distributed to the logic bank 401 to the right of 

6 horizontal breaker 403. Selection of the configuration 

7 memory cells within each logic block tile 501 is performed 

8 using well known circuitry and methods and therefore is not 

9 discussed in detail herein. 

10 Note that each logic bank 401 comprises an array of 

11 logic block tiles 501, each horizontal breaker 403 

12 comprises an array of horizontal breaker tiles 503, and 

13 each vertical breaker 4 02 comprises an array of vertical 

14 breaker tiles 502. Breaker tiles (both horizontal and 

15 vertical) include programmable connections between certain 

16 general interconnect lines (long lines) in adjacent banks, 

17 structures for distributing power, ground, and clock 

18 signals, as well as system-level circuitry to integrate CSL 

19 201 with other components in CSoC 101, such as CPU 206. 

20 Interface logic 405 (Figure 4) distributes write data 

21 signals sw_dws down every column defined by vertical 

22 breaker tiles 502 and logic bank tiles 501. In contrast, 

23 the read data signals sr_drs are propagated up through this 

24 same column to interface logic 405. 

25 In accordance with the present invention, the buses 

26 for distributing the write data and read data signals can 

27 be used for both configuration memory and user data, 

2 8 thereby providing an extremely efficient and flexible 

29 resource on CSoC 101. Figure 6 illustrates one embodiment 

30 of a vertical breaker tile 502 including such dual-purpose 

31 buses. Specifically, bus 601 distributes the write data 

32 signals sw_dws . Registers 602 provide a registered version 
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of those signals (fw_dws) to the configuration memory cells 
in the appropriate logic block tiles 501. In this 
embodiment, the same data is provided to every logic block 
tile 501 in the column of tiles, thereby providing 
flexibility in the placement of resources connected to bus 
601. Row select circuitry, well known to those skilled in 
the art, is used to determine which of those logic block 
tiles 501 should receive the configuration data. 

In a similar manner, read data bus 611 can carry 
signals from the general interconnect via bus 605 or from 
the configuration memory cells via bus 604. Specifically, 
multiplexers 606 selectively provide general interconnect 
signals fr_drs or configuration memory signals n_lrdata. 
The selected signals are stored in registers 607, then 
provided to multiplexers 608. Although not used if 
configuration memory signals are selected, multiplexers 608 
allow an arbitrary 4-way permutation of the assignment of 
the read data signals to facilitate placement and routing 
of user programmable logic. Multiplexers 609 receive the 
output signals of multiplexers 608 as well as the signals 
from registers 607 provided directly on bypass lines. If 
the signals in registers 607 are configuration signals, 
then multiplexers 609 are programmed to transfer the 
signals on the bypass lines. The output signals of 
multiplexers 609 are combined via OR gates 610 with the 
corresponding read data signals from an adjacent logic 
block tile in the same column, thereby forming bus 611. 
Thus, depending on the signal selection by multiplexers 
606, bus 611 can be used for general interconnect in CSL 
201 or for configuration read lines. 

Determining the states of the configuration memory 
cells in logic block tiles 501 is highly advantageous in 
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1 debug operations. For a more detailed description of such 

2 debug operations, see U.S. Serial No. (TRI-002) , 

3 entitled "Bus Mastering Debugging System For Integrated 

4 Circuits", filed by Triscend Corporation on October 15, 

5 1999, and incorporated by reference herein. 
6 

7 Zip Buffers and Input Multiplexers 

8 The present invention provides programmable zip 

9 buffers and input multiplexers to the logic block tiles to 

10 eliminate contention during configuration. Figure 7A 

11 illustrates zip buffers 701A-701D having a two-tier 

12 multiplexer structure in which signals from various lines 

13 in the general interconnect (single length lines and/or 

14 long lines) of CSL 201 are selectively chosen via 

15 multiplexers 702 and provided to multiplexers 703. If a 

16 logic one signal is provided to the control terminals of 

17 multiplexers 703, the interconnect signals are buffered and 

18 transferred to an input multiplexer 704 of a logic block 

19 tile 705. Depending on the logic states of memory cells 

20 706, a selected interconnect signal is buffered and then 

21 provided on line 707. 

22 In accordance with the present invention, CPU 2 06 

23 ensures a logic zero Enzip signal is provided to the 

24 control terminals of multiplexers 703 until step 325. In 

25 this manner, only logic zeros are provided to all logic 

26 block tiles 705 (via input lines 707) , thereby ensuring a 

27 clean start-up of CSL 201, 

28 Figure 7B illustrates an embodiment of an input 

29 multiplexer 704A for a logic block tile. In this 

30 embodiment, two 5 to 1 multiplexers 710A and 710B each 

31 receive four input signals in0-in3. This embodiment 

32 assumes that all signals output from zip buffers 701 
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1 (Figure 7A) are inverted. Thus, all input signals in0-in3 

2 to multiplexers 710 are logic ones. Multiplexer 710A has 

3 one input terminal connected to voltage VCC, whereas 

4 multiplexer 710B has one input terminal connected to 

5 receive the inverted Enzip signal (EnzipN) . Note that 

6 memory cells 702A-702D selectively pass signals in0-in3, 

7 respectively, whereas memory cell 702E selectively passes 

8 signal in4 . 

9 During configuration a logic one signal is provided to 

10 the gate of transistor 711 (Enzip = 0, EnzipN = 1), thereby 

11 turning on that transistor. Because transistor 711 is 

12 coupled to voltage VCC, a logic one signal is provided on 

13 line 712, the input line to a logic block tile (discussed 

14 in reference to Figure 5) . Thus, irrespective of the 

15 programming of memory cells 702A-702G, the signal on line 

16 712 during configuration is constant (i.e. a logic one 

17 signal) . Thus, the present invention advantageously 

18 eliminates any possibility of contention on line 712 to the 

19 logic block tile. Additionally, the present invention 

2 0 ensures that the internal logic of the CSL remains stable 

21 until configuration is complete. 

22 Note that after configuration, the EnzipN signal 

23 mimics as a supply voltage, i.e. a constant logic zero 

24 signal. Thus, in the user mode, the present invention 
2 5 ensures a different, constant signal source for 

26 multiplexers 710A and 710B. 
27 

28 Synopsis 

2 9 As mentioned previously, prior art FPLDs have 

3 0 dedicated wires to provide stored values to the 

31 configuration memory cells. Moreover, the prior art FPLDs 

32 provide dedicated hardware (i.e. shift registers) to load 



20 



T 



TRI-003 PATENT 

1 those values into the configuration memory cells. Both of 

2 these dedicated resources are used only during 

3 configuration. 

4 In contrast, the present invention provides an 

5 extremely efficient use of system resources. As described 

6 above, the data buses provided in CSL 2 01 (and forming part 

7 of the system bus) can be multi-purpose, i.e. used for both 

8 configuration and user logic in the general interconnect. 

9 Moreover, the use of CPU 2 06 in the configuration process 

10 eliminates the need for dedicated hardware for memory cell 

11 loading. 

12 As a further advantage of the present invention, all 

13 configuration memory locations in CSL 2 01 and storage 

14 locations in RAM 202 are mapped into the addressable memory 

15 space of system bus 205. This mapping provides significant 

16 advantages to the user. For example, any master of system 

17 bus 205 can read from or write to specific locations, 

18 thereby enhancing debug operations. 

19 Thus, the present invention provides a silicon 

2 0 efficient programmable logic solution while at the same 

21 time providing users optimal flexibility in configuration 

22 and system resources. 
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1 CLAIMS 
2 

3 1. A method of using a system bus on a configurable system 

4 on a chip (CSoC) , the CSoC including configurable system 

5 logic (CSL) , the method comprising: 

6 configuring a memory cell in the CSL using the system 

7 bus ; and 

8 reading the memory cell in the CSL using the system 

9 bus. 
10 

11 2. The method of Claim 1 wherein the system bus is 

12 controlled by a direct memory access (DMA) controller 

13 during the step of configuring. 
14 

15 3 . The method of Claim 1 wherein the system bus is 

16 controlled by a central processing unit (CPU) during the 

17 steps of configuring and reading . 
18 

19 4. A method of initializing a configurable system on a 

2 0 chip (CSoC) , the CSoC including configurable system logic 

21 (CSL) , the method comprising: 

22 selecting one of a plurality of clocks before 

23 configuring the CSL; 

24 configuring the CSL; and 

25 selecting one of the plurality of clocks after 

26 configuring the CSL. 
27 

28 5. A method of initializing a configurable system on a 

2 9 chip (CSoC) , the CSoC including configurable system logic 

3 0 (CSL) , the method comprising: 
31 
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1 checking for one of a serial and a parallel output 

2 device external to the CSoC and coupled to a memory 

3 interface unit (MIU) in the CSoC; 

4 setting the MIU to an appropriate mode based on the 

5 output device ; 

6 searching for a header in the output device; 

7 switching to another mode if the header is not found ; 

8 and 

9 configuring the CSL if the header is found. 
10 

11 6. The method of Claim 5, wherein if the header is not 

12 found in the other mode, then further including: 

13 checking whether the output device is accessible; 

14 powering down the CSoC if the output device is 

15 accessible ; and 

16 repeating the steps in Claim 5 if the output device is 

17 not accessible. 
18 

19 7. A programmable interconnect structure providing a 

2 0 signal to a logic block, the interconnect structure 

21 comprising: 

22 a plurality of interconnect lines divided into sets of 

23 interconnect lines; 

24 a plurality of first-tier multiplexers, each first - 

25 tier multiplexer having input terminals coupled to a set of 

26 interconnect lines and having an output terminal; 

27 a plurality of second- tier multiplexers, each second- 
2 8 tier having a first input terminal coupled to one output 

2 9 terminal of the first -tier multiplexers and a second input 

3 0 terminal coupled to a constant voltage source, 
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1 wherein the plurality of second-tier multiplexers 

2 selectively provide output signals from the first terminals 

3 or the second terminals to the logic block. 
4 

5 8. A method of providing signals to a logic block, the 

6 method comprising: 

7 selectively providing one of a plurality of 

8 interconnect signals and a constant logic signal to the 

9 logic block, 

10 wherein during a configuration mode, the constant 

11 logic signal is provided to the logic block, and 

12 wherein during a user mode, the plurality of 

13 interconnect signals are provided to the logic block. 
■*D 14 

j4 15 9. An input multiplexer to a logic block, the input 

16 multiplexer comprising: 
SI 17 a first multiplexer receiving a first plurality of 

~ 18 input signals and a constant signal; 

j!T 19 a second multiplexer receiving a second plurality of 

U 20 input signals and a mode signal; 

^ 21 a third multiplexer receiving output signals from the 

; IJ 22 first and second multiplexers and providing an output 

23 signal on a logic block input line; and 

24 a first transistor coupled to an internal logic block 

25 line, the first transistor controlled by the mode signal. 
26 

27 10. A method of providing signals on an internal logic 

28 block line of a logic block, the method comprising: 

29 providing a plurality of input signals to a plurality 
3 0 of selectors, wherein each selector further receives one of 
31 a constant logic value and a mode signal; 



24 



TRI-003 



PATENT 



1 selectively providing one of the plurality of input 

2 signals, the constant logic value , and the mode signal on 

3 the internal logic block line, wherein the mode signal 

4 indicates whether the logic block is in a configuration 

5 mode or a user mode. 
6 

7 11. The method of Claim 10, wherein during configuration, 

8 the plurality of input signals, the constant logic value, 

9 and the mode signal have the same value. 
10 

11 12. The method of Claim 10, wherein during the user mode, 

12 the constant logic value and the mode signal have different 
^ 13 values . 

•5™ 

M> 15 13. The method of Claim 10, further including using the 

f% 16 mode signal to transfer the constant logic value to the 

00 17 internal logic block line. 

=31 
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1 



CONFIGURATION IN A CONFIGURABLE SYSTEM ON A CHIP 



2 



Brian Fox 



3 



Andreas Papaliolios 



4 



5 



ABSTRACT OF THE INVENTION 



6 



The present invention allows a user to customize the 



7 configuration sequence of a configurable system on a chip 

8 (CSoC) , thereby adding considerable flexibility to the 

9 configuration process. The present invention also provides 

10 certain features, transparent to the user, which optimize 

11 system resources and ensure the correct initialization of 

12 the CSoC. The CSoC leverages an on-chip central processing 

13 unit (CPU) to control the configuration process of the 

14 configurable system logic (CSL) . Advantageously, the CSL 

15 configuration memory cells as well as other programmable 

16 locations in the CSoC are addressable as part of a system 

17 bus address space. The system bus is a multi-use structure 

18 that can be used for both configuring and reading of memory 

19 cells. In this manner, the CSoC optimizes system 

20 resources . 
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